Association rule extraction is one of the most attractive and widespread data mining tasks in 
business and in modern society. It aims to uncover interesting relationships between collections of 
articles, products or items in large transactional databases. The immense number of association 
rules obtained is the main obstacle a decision maker has to handle. Consequently, several 
interestingness measures have been introduced to identify the most interesting association rules. 
Currently, no single measure is optimal for judging the selected association rules. To address 
this problem, we propose applying the ELECTRE method, a multi-criteria decision-making method, 
while taking into consideration a formal study of interestingness measures according to their 
structural properties, with the aim of finding a good compromise and selecting the most 
interesting association rules without eliminating any measure. Experiments conducted on reference 
data sets show a significant improvement in the performance of the proposed strategy. 
Nomenclature and abbreviations 

Nomenclature 
Aj      Alternative j 
aij     The performance of alternative Aj against criterion Ci 
c*      The concordance threshold 
Ci      Criterion i 
cjk     The concordance index for the pair of alternatives Aj and Ak 
d*      The discordance threshold 
D       Transactional database 
djk     The discordance index for the pair of alternatives Aj and Ak 
I       The set of all items 
Mj      The interestingness measure j 
Ri      Association rule i 
T       The set of transactions 
wi      The weight of criterion Ci 

Abbreviations 
AR      Association rules 
CONF    Confidence 
COS     Cosinus 
SUP     Support 
DEA     Data Envelopment Analysis 
DM      Data mining 
ECR     Example and counter-example rate 
IG      Information gain 
JRD     Jaccard 
MCDA    Multi-Criteria Decision Analysis 
PS      Piatetsky-Shapiro 
KDD     Knowledge discovery in databases 
1. INTRODUCTION 

Knowledge discovery in databases (KDD) is a new discipline whose purpose is to extract 
information hidden in large amounts of data that can be useful to users in decision-making processes, 
information management, planning, research, process control and management, and query optimization. 
KDD is a multi-step process running from the pre-selection and preparation of data to the interpretation of 
results, with data mining (DM) as its central phase. Data mining is a major field of computer science research; it is 
used in application areas such as business (insurance, commerce, finance), scientific 
studies (astronomy, medicine), and government security (discovery of criminals and terrorists). Association 
rule mining is one of the most important DM tasks; it aims to identify and discover interesting and useful patterns 
and relationships in massive amounts of data. An association rule is an implication of the form 
A → B, where A and B are disjoint itemsets. The strength of an association rule can be judged according to its 
support and confidence. 

The majority of existing association rule algorithms [1], which are based on support and confidence, 
generate a large number of rules. As a result, users and decision-makers are unable to identify the most 
relevant rules and, consequently, are unable to make decisions. To overcome this problem, several 
measures have been proposed in the literature to discover the best rules [2-3]. Nevertheless, the abundance of 
these measures has created an additional obstacle, namely the choice of the measures that 
adequately satisfy the users. 

Many studies aim to assist the user in choosing the measure most appropriate to the 
scope of the decision. In some studies, the ranking of the rules produced by each interestingness measure is 
compared with the ranking given by human experts, and the measure whose ranking is closest to that of the experts is kept. 
Nevertheless, their findings cannot be considered a general conclusion. Furthermore, it is not always possible 
to obtain an expert ranking. Other techniques and strategies have been introduced, either by providing a set of 
criteria for designing interestingness measures [4], or by examining the similarity between 
measures in order to rank them [5]. Vaillant et al. [6] proposed deriving a preorder on a set of measures and identifying 
clusters of measures. Toloo et al. [7] suggested evaluating and ranking 
association rules against several criteria through a non-parametric data envelopment analysis 
(DEA) procedure. Similarly, S. Shukla et al. [8] proposed a model for prioritizing 
association rules produced by data mining that takes into account the preferences of the 
decision maker for different criteria. Other authors identify useful rules without favoring 
or rejecting any measure by using the notion of undominated rules (skyrules), based on dominance between rules [9]. Our 
previous work [10] proposed discovering meaningful and pertinent rules by combining the 
notion of dominance between rules with genetic algorithms. This paper continues in the same direction: we introduce 
an approach based on the ELECTRE method, which makes it possible to select association rules without favoring or 
removing any measure. 

The paper is organized as follows. In the second section, we present the required scientific 
background: an overview of association rules, interestingness measures and their structural 
properties, and the ELECTRE method. In the third section, we present our approach based on the ELECTRE method. In the fourth 
section, we examine and analyze the experimental results. The conclusion and the scope of future 
research are presented in the final section. 


2. BACKGROUND 
2.1. Association rules mining 

Association rule mining (ARM) is an effective technique for analyzing very large binary data sets. A common use is to uncover the 
relationships among binary variables in transaction datasets; this kind of analysis is referred to as 
"market basket analysis". 


Let $I = \{i_1, i_2, \dots, i_m\}$ be the set of all items. Association rules are extracted from a large set of 
transactions, denoted by $T$, with $T = \{t_1, t_2, \dots, t_n\}$; every transaction $t_i$ is an itemset and satisfies $t_i \subseteq I$. 
Given a non-empty set $I$, an association rule is a statement of the form $X \to Y$, where $X, Y \subset I$ such that 
$X \cap Y = \emptyset$. The set $X$ is called the antecedent of the rule; the set $Y$ is called the consequent of the rule. 

An association rule can be considered interesting if its items frequently occur together and there 
are indications that the presence of one set leads, in some sense, to the presence of the other set. The strength of an 
association rule can be estimated with two measures called "support" and 
"confidence". 
$$\mathrm{Support}(X \to Y) = P(XY) = \frac{n(X \cup Y)}{n}, \qquad \mathrm{Confidence}(X \to Y) = \frac{P(XY)}{P(X)} = \frac{n(X \cup Y)}{n(X)}$$

where $n(X \cup Y)$ is the number of transactions that contain all the items of the rule (i.e. $X \cup Y$), $n(X)$ is 
the number of transactions containing the itemset $X$, and $n$ is the total number of transactions. To find significant 
association rules in a given database, the support and the confidence of a rule should satisfy 
user-defined thresholds, called minimum support ("Minsup") and 
minimum confidence ("Minconf"). 
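
The two definitions above can be illustrated with a minimal sketch. The toy transactions and the helper names (`count`, `support`, `confidence`) are assumptions made for illustration only; they are not taken from the experiments of this paper.

```python
# Toy transactional database (hypothetical data): each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def count(itemset, transactions):
    """n(itemset): number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """Support(X -> Y) = n(X u Y) / n."""
    return count(x | y, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Confidence(X -> Y) = n(X u Y) / n(X)."""
    return count(x | y, transactions) / count(x, transactions)

x, y = {"bread"}, {"milk"}
print(support(x, y, transactions))     # 2 of the 4 transactions contain bread and milk
print(confidence(x, y, transactions))  # 2 of the 3 bread transactions also contain milk
```

A rule would then be retained only if both values reach the chosen Minsup and Minconf thresholds.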


2.2. Interestingness measures 

ARM can produce a huge number of rules, most of which are of no interest to the decision maker 
or the user. Interestingness measures (IM) play a crucial role in DM: they are used to detect the truly interesting rules and to 
select and rank items and patterns according to their potential benefit to the user. These measures may 
be divided broadly into two categories: objective (data-driven) measures, based on the statistical strength or 
characteristics of the discovered rules, and subjective (user-driven) measures, e.g. unexpectedness and actionability [11], 
which are derived from the user's views, beliefs or expectations regarding the particular problem 
domain. 

Support, confidence, and lift are the most widely used objective measures for selecting relevant 
rules. In addition to these, several other objective measures are discussed by Tan et al. [12], 
such as the phi-coefficient, odds ratio, kappa, mutual information, J-measure, Gini index, 
Laplace, conviction, interest, and cosine measures. Their study confirms that 
these measures have different fundamental properties. It classifies them from several viewpoints, 
compares their characteristics, identifies their roles in the DM process, provides procedures for 
choosing suitable measures for an application, and concludes that no measure is better than the others in all 
application domains. 

Liu et al. [13] examine the discovered AR against the user's specifications in order to identify those that are relevant and 
interesting for the user. Rules are considered most relevant if they are unexpected (conflicting with the user's beliefs) 
or if they provide strategic information on which the user can act. 

Other authors [14] identify interesting rules using a methodology that merges data-based 
(objective) and user-oriented (subjective) evaluation measures. Their idea is that objective 
measures are first used to filter the set of rules; subjective measures are then used to help the user analyze 
the rules according to their understanding and objectives. 

Paul Razan [15] used a semantic IM for discovering AR. A semantic IM takes into account how data 
attributes are semantically related. It relies on an ontology that captures the relationships between the corresponding 
items (e.g. specialization, generalization, etc.). Owing to the great number of IM in the literature, 
selecting the proper one becomes a serious challenge. To address this challenge, various techniques and methods 
have been introduced that propose formal, intuitive criteria which a good measure should satisfy in order to assess the 
level of interest of a rule [16]. Tan et al. [12] discuss the properties of a set of measures and conclude that 
no measure is better than the others in all areas of application. A selection of objective IM used to assess 
the quality of the rules is presented in Table 1. 


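Table 1. Some interestingness measures 

Measure                            Formula 
Lift                               $\mathrm{Lift}(X \to Y) = \frac{P(XY)}{P(X)P(Y)}$ 
Pearl                              $\mathrm{PRL}(X \to Y) = P(X)\,\lvert P(Y/X) - P(Y) \rvert$ 
Information Gain                   $\mathrm{IG}(X \to Y) = \log\left(\frac{P(XY)}{P(X)P(Y)}\right)$ 
Loevinger                          $\mathrm{LVG}(X \to Y) = \frac{P(Y/X) - P(Y)}{1 - P(Y)}$ 
Example & Counter Example Rate     $\mathrm{ECR}(X \to Y) = 2 - \frac{1}{\mathrm{conf}(X \to Y)}$ 
Conviction                         $\mathrm{CNV}(X \to Y) = \frac{P(X)P(\overline{Y})}{P(X\overline{Y})}$ 
Jaccard                            $\mathrm{JRD}(X \to Y) = \frac{P(XY)}{P(X) + P(Y) - P(XY)}$ 
Zhang                              $\mathrm{Zhang}(X \to Y) = \frac{P(XY) - P(X)P(Y)}{\max\{P(XY)P(\overline{Y}),\ P(Y)P(X\overline{Y})\}}$ 
Cosinus                            $\mathrm{COS}(X \to Y) = \frac{P(XY)}{\sqrt{P(X)P(Y)}}$ 
Piatetsky-Shapiro                  $\mathrm{PS}(X \to Y) = P(XY) - P(X)P(Y)$ 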
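
As an illustration of how several of these measures can be evaluated for one rule, the following minimal sketch computes them from the three probabilities $P(X)$, $P(Y)$ and $P(XY)$. The function name `measures`, the example probabilities and the choice of base 2 for the logarithm of the information gain are assumptions made for this example.

```python
import math

def measures(p_x, p_y, p_xy):
    """Some objective measures of a rule X -> Y from P(X), P(Y) and P(XY).

    Assumes 0 < p_xy <= min(p_x, p_y) so that every ratio is defined.
    """
    conf = p_xy / p_x
    return {
        "SUP": p_xy,
        "CONF": conf,
        "LIFT": p_xy / (p_x * p_y),
        "COS": p_xy / math.sqrt(p_x * p_y),
        "IG": math.log2(p_xy / (p_x * p_y)),  # logarithm base 2 is an assumption
        "PS": p_xy - p_x * p_y,
        "JRD": p_xy / (p_x + p_y - p_xy),
        "ECR": 2 - 1 / conf,
    }

# Hypothetical rule with P(X) = 0.5, P(Y) = 0.5 and P(XY) = 0.4.
for name, value in measures(0.5, 0.5, 0.4).items():
    print(f"{name}: {value:.3f}")
```

Each measure places the same rule on a different scale, which is why a single ranking is hard to obtain from them directly.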
2.3. Properties of a measure 

The formal study of interestingness measures consists in proposing a set 
of formal properties, treated by several works in the literature, according to which 
the most appropriate measures are chosen for the user's needs. We now summarize 
a set of properties proposed in the literature, in order to give a general idea of the 
different existing properties. 


Property 1: The value of the measure must be zero, $M(X \to Y) = 0$, when $X$ and $Y$ are independent, 
i.e. $P(XY) = P(X)P(Y)$. 

Property 2: The measure $M(X \to Y)$ must increase monotonically with $P(XY)$ when 
the size of the premise $n_X$ and the size of the conclusion $n_Y$ remain constant. 

Property 3: The measure $M(X \to Y)$ must decrease monotonically with the size of the premise $n_X$ or 
with the size of the conclusion $n_Y$ when the other parameters ($n$, $n_{XY}$ and, respectively, $n_Y$ or $n_X$) remain the same. 

Property 4: The measure is antisymmetric under the column or row permutation operation. 
Property 5: The measure $M(X \to Y)$ must verify the following relationship: 
for any rule $X \to Y$, we should have $M(X \to Y) = M(\overline{X} \to \overline{Y})$. 

Property 6: The measure $M(X \to Y)$ must verify the following relationship: 
for any rule $X \to Y$, we should have $M(X \to Y) = -M(X \to \overline{Y})$. 

Property 7: Desired relationship between the rules $X \to Y$ and $Y \to X$: the measure $M$ must verify the 
relationship $M(X \to Y) = M(Y \to X)$. 


Property 8: The measure $M(X \to Y)$ must be invariant when the size $n$ of the learning set $T$ increases 
and all other counts ($n_X$, $n_Y$ and $n_{XY}$) remain constant. 

Property 9: The concrete meaning or understandability of the measure, i.e., the 
measure must be intelligible and easy for the user to interpret, so that the results obtained 
can be communicated and explained. 


Property 10: The measure must make it possible to distinguish between $X \to Y$ and $X \to \overline{Y}$, the 
examples of one being the counter-examples of the other. 

Property 11: A measure must evaluate $X \to Y$ and $\overline{Y} \to \overline{X}$ in the same way in the case of logical 
implication. 
The measure should also take a fixed value in the case of logical implication, i.e. 
$\exists\, b \in \mathbb{R}$ such that, for every rule $X \to Y$, $P(Y/X) = 1 \Rightarrow M(X \to Y) = b$. 


Property 12: Ease of setting a threshold for acceptance of the rule [15]. It is preferable that 
the measure lends itself well to the determination of an acceptance threshold, because this allows 
the interesting rules to be retained without having to rank them. 

Property 13: The measure must have a fixed value in the case of equilibrium, i.e. when the 
number of examples and the number of counter-examples are identical. 

Property 14: The measure must be robust, i.e., the value of the measure for a rule must be resistant to database 
disturbances such as a typo introduced during database creation or a missing value in the data. 

The literature thus contains many properties, which turns the problem into the description of 
a large number of key criteria and structural conditions that the interestingness measures 
must verify in order to choose the right measure for a given application area. However, these approaches do not 
guarantee that the most appropriate measure is selected, simply because a given measure may fail to 
verify one of the properties used. 
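
To make the idea concrete, the following minimal sketch checks two of the properties numerically for three measures: Property 1 (zero value at independence) and Property 7 (same value for $X \to Y$ and $Y \to X$). The probability settings are hypothetical, and a check on a few sample points only suggests whether a property holds; it is not a proof.

```python
import math

# Measures written as functions of P(X), P(Y) and P(XY).
MEASURES = {
    "CONF": lambda px, py, pxy: pxy / px,
    "LIFT": lambda px, py, pxy: pxy / (px * py),
    "PS": lambda px, py, pxy: pxy - px * py,
}

# Hypothetical (P(X), P(Y)) settings used for the numerical check.
SETTINGS = [(0.4, 0.5), (0.2, 0.7), (0.6, 0.3)]

def zero_at_independence(m):
    """Property 1: M(X -> Y) = 0 whenever P(XY) = P(X) P(Y)."""
    return all(
        math.isclose(m(px, py, px * py), 0.0, abs_tol=1e-12) for px, py in SETTINGS
    )

def symmetric(m):
    """Property 7: swapping the roles of X and Y leaves the value unchanged."""
    return all(math.isclose(m(px, py, 0.1), m(py, px, 0.1)) for px, py in SETTINGS)

for name, m in MEASURES.items():
    print(name, zero_at_independence(m), symmetric(m))
```

On these sample points only Piatetsky-Shapiro passes both checks, lift passes the symmetry check only, and confidence passes neither, which illustrates why different measures verify different numbers of properties.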


2.4. ELECTRE method 

Multi-criteria decision analysis (MCDA) [17] is a common framework for handling difficult 
decision-making situations with several, often conflicting, goals and objectives that groups and/or 
decision-makers value differently. Many MCDA techniques have been developed over the years and 
applied to decision problems in different areas. Among the popular research areas within MCDA is 
the outranking approach, and in particular the ELECTRE methods [18]. MCDA aims to provide decision-makers 
or data analysts with a set of tools to help them resolve a decision problem in which several points 
of view, often contradictory, must be taken into account. 

There are many ways to classify the existing MCDA methods. The best-known 
classification is that adopted by Roy [19], which groups MCDA methods into three main families: 
— Value measurement models (aggregation method). 
— Purpose, aspiration or reference level models. 
—  Outranking Models. 


Consider an MCDA problem with $m$ criteria and $n$ alternatives. Let $C_1, \dots, C_m$ and $A_1, \dots, A_n$ 
denote the criteria and alternatives, respectively. The value $a_{ij}$ describes the performance of alternative $A_j$ 
against criterion $C_i$. We assume that a higher value indicates a better performance, since any minimization 
objective can easily be converted into a maximization objective. We assign to each criterion $C_i$ a 
positive weight $w_i$, which indicates the relative importance of criterion $C_i$. 


The ELECTRE method belongs to the outranking family. It aims to obtain all the alternatives that 
dominate other alternatives and cannot be dominated by any other. ELECTRE [18] (Elimination Et 
Choix Traduisant la Réalité) is an MCDA method that allows decision makers to select 
the best choice, with the most advantage and the least conflict, with respect to different criteria. We use 
ELECTRE to choose the suitable option from a set of actions. The simplest method of the ELECTRE 
family is ELECTRE I. 

The preferred choice is then determined from the performance of each 
alternative on each criterion (in the form of an $m \times n$ decision matrix) and the corresponding weights of the 
criteria established by the decision-maker. 


To build the preference information between each pair of alternatives $A_j$ and $A_k$ 
($j, k = 1, \dots, n$), ELECTRE uses the notion of outranking relations. The alternative $A_j$ outranks $A_k$ if, on a large 
part of the criteria, $A_j$ performs at least as well as $A_k$ (concordance condition), while its worse performance is still 
acceptable on the other criteria (non-discordance condition). After determining for each pair of alternatives 
whether one alternative outranks the other, these pairwise outranking assessments can be combined into a partial or 
total ranking. Each criterion is assigned a subjective weight $w_i$ by the 
decision-maker, where $\sum_{i=1}^{m} w_i = 1$. The ELECTRE method is based on concordance and 
discordance indices, described as follows. We first check the data of the decision table and verify 
that the weights of all criteria sum to 1. 


The concordance index $c_{jk}$ for each pair of alternatives $A_j$ and $A_k$, $j, k = 1, \dots, n$, $j \neq k$ 
(note that an alternative is not compared to itself), is defined as the sum of the weights of those 
criteria on which the performance of $A_j$ is at least as high as that of $A_k$, i.e. 

$$c_{jk} = \sum_{i:\ a_{ij} \ge a_{ik}} w_i, \qquad j, k = 1, \dots, n,\ j \neq k.$$

The concordance index lies between 0 and 1. 


Likewise, the discordance index $d_{jk}$ is computed as follows: for each criterion on which $A_k$ outperforms $A_j$, 
we take the ratio between the difference in performance between $A_k$ and $A_j$ and the maximum 
difference in performance on that criterion between any pair of alternatives. The maximum of these ratios 
(which must lie between 0 and 1) is the discordance index. That is, $d_{jk} = 0$ if $a_{ij} > a_{ik}$ for all $i = 1, \dots, m$, i.e. 
the discordance index is zero if $A_j$ performs better than $A_k$ on all criteria. Otherwise, 

$$d_{jk} = \max_{i = 1, \dots, m} \frac{a_{ik} - a_{ij}}{\max_{l = 1, \dots, n} a_{il} - \min_{l = 1, \dots, n} a_{il}}, \qquad j, k = 1, \dots, n,\ j \neq k.$$
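
The two indices can be sketched as follows, under the convention used in this section that row $i$ of the decision matrix holds criterion $C_i$ and column $j$ holds alternative $A_j$. The function names and the small matrix are assumptions made for illustration.

```python
def concordance(a, w, j, k):
    """c_jk: sum of the weights of the criteria on which A_j is at least as good as A_k."""
    return sum(w[i] for i in range(len(a)) if a[i][j] >= a[i][k])

def discordance(a, j, k):
    """d_jk: largest normalized advantage of A_k over A_j across the criteria."""
    worst = 0.0
    for row in a:  # one row per criterion
        spread = max(row) - min(row)  # largest gap between two alternatives on this criterion
        if row[k] > row[j] and spread > 0:
            worst = max(worst, (row[k] - row[j]) / spread)
    return worst

# Hypothetical decision matrix: 2 criteria (rows) and 3 alternatives (columns).
a = [
    [8, 6, 9],
    [5, 7, 4],
]
w = [0.6, 0.4]  # criteria weights, summing to 1

print(concordance(a, w, 0, 1))  # A_0 is at least as good as A_1 on the first criterion only
print(discordance(a, 0, 1))     # A_1 beats A_0 on the second criterion by 2 out of a spread of 3
```

Restricting the maximum to the criteria on which $A_k$ is strictly better gives the same value as the formula, because the remaining ratios are zero or negative.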
Then, an overall concordance threshold, $c^*$, and an overall discordance threshold, $d^*$, are defined to 
perform the overall concordance and discordance tests. The higher the concordance threshold and the lower 
the discordance threshold, the more difficult it is to pass the tests (generally, $c^* = 0.7$ and $d^* = 0.3$ [20]). For an outranking 
relationship to be judged true, the two indices must not violate their corresponding thresholds, 
that is, $c_{jk} \ge c^*$ and $d_{jk} \le d^*$. Once the two tests have been completed for all pairs of alternatives, the preferred 
alternatives are those that outrank more often than they are outranked. By building such a relation between each 
pair of alternatives, one can then remove the dominated alternatives and obtain the non-dominated 
solutions. 
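
The selection step can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `electre_select`, the example matrix, and the reading of "outrank more than being outranked" as a strict comparison of counts are assumptions.

```python
def electre_select(a, w, c_star=0.7, d_star=0.3):
    """Indices of the alternatives that outrank more often than they are outranked.

    a[i][j] is the performance of alternative j on criterion i; w[i] is the weight
    of criterion i (weights are expected to sum to 1).
    """
    m, n = len(a), len(a[0])
    spread = [max(row) - min(row) for row in a]

    def outranks(j, k):
        c = sum(w[i] for i in range(m) if a[i][j] >= a[i][k])
        d = max(
            (a[i][k] - a[i][j]) / spread[i] if spread[i] > 0 else 0.0
            for i in range(m)
        )
        # Both tests must pass: enough concordance and not too much discordance.
        return c >= c_star and max(d, 0.0) <= d_star

    wins = [sum(outranks(j, k) for k in range(n) if k != j) for j in range(n)]
    losses = [sum(outranks(k, j) for k in range(n) if k != j) for j in range(n)]
    return [j for j in range(n) if wins[j] > losses[j]]

# Hypothetical problem: 3 criteria (rows) and 3 alternatives (columns).
a = [
    [9, 7, 2],
    [8, 7, 3],
    [6, 7, 10],
]
w = [0.5, 0.3, 0.2]
print(electre_select(a, w))
```

In this example the first alternative outranks the second (concordance 0.8, discordance 0.25) while no other pair passes both tests, so only the first alternative is selected.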

The partial ranking produced by an outranking method may not directly identify the best alternative. A 
subset of alternatives can be defined such that at least one member of the subset outranks any alternative 
not in the subset. The goal is to make this subset as small as possible. This subset can be regarded as a 
shortlist, within which a good compromise alternative should be found by additional methods or 
considerations. 


3. THE PROPOSED APPROACH 

Association rule mining algorithms such as Apriori [1, 21], Close, Close+ [22] or CHARM [23] produce 
an immense number of rules. It may therefore be hard to select valuable knowledge 
from them, and we risk losing information. In this context, we propose using the ELECTRE multi-criteria decision 
analysis (MCDA) method to obtain a good compromise without eliminating or favoring any 
measure, which makes it possible to choose the most interesting association rules. 

After ARM on a transactional database D, let R be the set of AR extracted by Apriori, and M a set of 
measures used to evaluate the rules. We take the set of rules as alternatives and the set of measures as criteria to 
build the decision table. 
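
A minimal sketch of this construction is given below, with rules as alternatives (columns) and measures as criteria (rows). The three rules, their probabilities and the choice of three measures are hypothetical and serve only to show the shape of the decision table.

```python
# Hypothetical rules described by their probabilities (P(X), P(Y), P(XY)).
rules = {
    "R1": (0.5, 0.5, 0.4),
    "R2": (0.4, 0.5, 0.2),
    "R3": (0.2, 0.8, 0.2),
}

# Measures used as criteria, written as functions of P(X), P(Y) and P(XY).
criteria = {
    "SUP": lambda px, py, pxy: pxy,
    "CONF": lambda px, py, pxy: pxy / px,
    "LIFT": lambda px, py, pxy: pxy / (px * py),
}

# decision[i][j]: value of measure i (criterion) for rule j (alternative).
decision = [[f(*rules[r]) for r in rules] for f in criteria.values()]

for name, row in zip(criteria, decision):
    print(name, [round(v, 3) for v in row])
```

Each row can then be compared across rules by the concordance and discordance tests of the ELECTRE method.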

Let $R_1$ and $R_2$ be two association rules. A true outranking relation of $R_1$ over $R_2$ implies that $R_1$ is preferred to $R_2$. We say that an 
association rule $R_1$ outranks another association rule $R_2$ if and only if $R_1$ is at least as good as $R_2$ on a majority of criteria and 
is not significantly worse on any other criterion (i.e., the difference between the two is within a 
predefined threshold). 

We calculate the concordance index and the discordance index for each pair of rules. To 
build an outranking relation, both indices must satisfy their corresponding thresholds. The 
preferred association rules are those that outrank more often than they are outranked. 

The main idea of this contribution is to apply the ELECTRE method to find the best association 
rules. For this purpose, measures are taken as criteria and association rules as alternatives, which makes it 
possible to create the decision matrix. The second contribution of this work is to take into consideration the 
formal study of interestingness measures according to their structural properties, which are integrated into 
the ELECTRE method at the weight level: the weight of each measure is taken as the number of 
properties verified by that measure. On this point, our work is supported by the advantage of using 
formal properties to decide which is the right measure [12, 24-25]. 
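
The weighting scheme can be sketched as follows. The property counts below are hypothetical placeholders, not values established in this paper, and dividing by the total so that the weights sum to 1 (as the ELECTRE method expects) is an assumption of this sketch.

```python
# Hypothetical number of properties verified by each measure (illustrative values only).
properties_verified = {"SUP": 4, "CONF": 6, "LIFT": 8, "PS": 9}

# Normalize the counts so that the criteria weights sum to 1 (assumption of this sketch).
total = sum(properties_verified.values())
weights = {name: count / total for name, count in properties_verified.items()}

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

A measure that verifies more properties therefore contributes more to the concordance index of a pair of rules.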


4. RESULTS AND DISCUSSION 

In this section, we examine and demonstrate the advantages of the proposed method. First, we 
generate AR using Apriori [1] from a set of well-known datasets obtained from the UCI machine learning 
repository [26]: mushroom (Mus), flare1 (F1), flare2 (F2), monks1 (M1), monks2 (M2), monks3 (M3) and 
zoo (Z). Table 2 summarizes the properties of these datasets and gives the minimum support used for 
each dataset and the number of rules obtained from each dataset with the Apriori algorithm. 


Table 2. Characteristics of the datasets used and number of AR generated for each dataset 


Data set Items Transactions Minsup Number of rules generated 
Mushroom 22 8124 40 2654 
Flare1 32 323 20 3468 
Flare2 32 1066 20 3342 
Monks1 19 432 20 3564 
Monks2 19 432 2) 2422 
Monks3 19 432 5 2516 
Zoo 28 101 5 2554 
As described in Section 2.2, we apply a collection of interestingness measures to judge association 
rules. The measures used in the experiments are: support (SUP), confidence (CONF), lift, cosinus 
(COS), information gain (IG), Piatetsky-Shapiro (PS), Jaccard (JRD) and example & counter-example rate 
(ECR). These measures are calculated using the equations given in Table 1, and we take as the weight of 
each measure the number of properties verified by that measure. 

We then apply the ELECTRE I algorithm to select the most interesting association rules using multiple 
criteria. Table 3 displays the results obtained from the rules produced by Apriori for each dataset 
(MUS, Z, M1, M2, M3, F1, F2). 


Table 3. The obtained results for different datasets 
Mushroom  Flare1 Flare2 Monks1 Monks2 Monks3 Zoo 


A.R 2654 3468 3342 2422 2516 2554 3564 
Skyrules 658 13 105 509 252 43 1596 
Our approach 730 1808 529 1635 2180 1748 512 


The experimental evaluation serves several purposes. First, we confirm 
through experiments that our method can significantly decrease the immense number of rules generated from 
the data sets. To validate our approach, we compare it with another method, skyrules. These tests 
make it possible to quantify the reduction in the number of rules offered by our method. Therefore, we compare the number of 
non-dominated rules of our method to the number of non-dominated rules of skyrules and to the total number 
of association rules (denoted AR). 

The skyrule strategy aims to identify undominated association rules, without favoring or eliminating 
any interestingness measure, by using a dominance relation. Table 3 compares the number of non-dominated rules of 
our approach with the number of skyrules [9] and with the number of all association rules. The corresponding 
histograms are given in Figure 1. 
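
For comparison, a dominance-based filter in the spirit of the skyrule strategy can be sketched as follows. This sketch assumes the usual Pareto dominance (at least as good on every measure and strictly better on at least one); the exact definition used in [9] may differ, and the rule values are hypothetical.

```python
def dominates(u, v):
    """True if u is at least as good as v on every measure and strictly better on one."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))

def undominated(rules):
    """Names of the rules that are not dominated by any other rule."""
    return [
        name
        for name, u in rules.items()
        if not any(dominates(v, u) for other, v in rules.items() if other != name)
    ]

# Hypothetical rules evaluated on three measures (support, confidence, lift).
rules = {
    "R1": (0.4, 0.8, 1.6),
    "R2": (0.2, 0.5, 1.0),
    "R3": (0.2, 1.0, 1.25),
}
print(undominated(rules))  # R2 is dominated by R1, so R1 and R3 remain
```

Unlike the ELECTRE tests, this filter uses neither weights nor thresholds: a rule is discarded only when another rule is at least as good on every measure.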


Figure 1. The corresponding histogram of the results (series: A.R, Skyrules, Our approach) 

In order to investigate the performance of the proposed method, we compared, for each dataset, the average 
confidence of the rules selected by our approach with that of the skyrule approach [9] and with that of all the association 
rules produced by Apriori. Figure 2 shows the average confidence values obtained for the 
different datasets. The figure indicates that the rules obtained by our method are of good 
quality compared to the other methods, which supports the effectiveness of the proposed method. 


Figure 2. The histogram of the average of confidence (series: A.R, Skyrules, Our approach) 

5. CONCLUSION 

In this paper, we introduced an approach that uses MCDA to find interesting association 
rules. The principal benefit of the proposed approach is that it is not limited by the abundance of 
measures and that it judges association rules using a set of criteria rather than a single one. When the proposed 
algorithm is applied to various datasets, we obtain results containing the desired rules with maximum 
interestingness. The number of rules generated by the proposed algorithm is significantly smaller 
compared to the skyrule method. Therefore, we can say that our algorithm overcomes the problem of the 
abundance of AR and selects association rules efficiently and adequately. As future work, we 
intend to improve our approach so that it can classify and rank the association rules. 